Explicit duration modeling for Cantonese connected-digit recognition
Abstract
This paper describes a study on using explicit duration models in hidden Markov model (HMM) based Cantonese connected-digit recognition. An HMM does not give explicit control over the temporal structure of speech. As a result, the recognition output may exhibit unreasonable duration patterns, which are often accompanied by recognition errors. We propose to use a duration model that models the relative duration of the tail part of a Cantonese digit, together with conventional word-level duration models. The duration models are integrated into the Viterbi search algorithm for speech recognition. Experimental results show that the proposed method leads to a substantial reduction of recognition errors, especially for slowly spoken utterances.
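As a rough illustration of how a duration score can enter a Viterbi-style search, the sketch below adds a duration log score to an acoustic path score whenever a word hypothesis ends. It is not the paper's implementation: the Gaussian form of both duration models, the `(mean, std)` parameters, the `weight` factor, and all function names are assumptions made for this example.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def duration_log_score(word_frames, tail_frames, word_model, tail_model):
    """Duration log score for one word hypothesis.

    word_frames: number of frames assigned to the whole word
    tail_frames: number of frames assigned to its tail part
    word_model, tail_model: hypothetical (mean, std) pairs; the Gaussian
        form is an assumption of this sketch, not taken from the paper
    """
    relative_tail = tail_frames / word_frames  # tail duration relative to the word
    return (gaussian_log_pdf(word_frames, *word_model)
            + gaussian_log_pdf(relative_tail, *tail_model))


def rescored_path_score(acoustic_log_score, duration_log_scores, weight=1.0):
    """Combine an acoustic path score with duration scores of its words.

    weight is a hypothetical scaling factor balancing the two knowledge sources.
    """
    return acoustic_log_score + weight * sum(duration_log_scores)


# Hypothetical models: words average 30 frames, tails take about 40% of a word.
word_model = (30.0, 6.0)
tail_model = (0.4, 0.1)

typical = duration_log_score(30, 12, word_model, tail_model)
too_short = duration_log_score(4, 1, word_model, tail_model)
print(typical > too_short)  # a 4-frame word is penalized relative to a 30-frame one
```

In a real decoder the duration score would be applied inside the search at each word boundary, so that hypotheses with implausibly short or long words are penalized before the best path is chosen.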
Similar resources
Using Duration Information in Cantonese Connected-Digit Recognition
This paper presents an investigation on the use of explicit statistical duration models for Cantonese connected-digit recognition. Cantonese is a major Chinese dialect. The phonetic compositions of Cantonese digits are generally very simple. Some of them contain only a single vowel or nasal segment. This makes it difficult to attain high accuracy in the automatic recognition of Cantonese digit ...
Duration Modeling in Mandarin Connected Digit Recognition
Digit string recognition is required in many applications which need to recognize numbers such as telephone numbers, credit card numbers, dates, etc. In order to design a high-performance recognizer, duration information is explored in this study. In a Mandarin connected digit recognizer, insertion and deletion errors amount to more than two thirds of the total recognition errors because there e...
Durational modelling for improved connected digit recognition
A durational modelling technique is proposed for CDHMM-based connected digit recognition. This reduces the insertion error rate, which is typically the most frequent recognition error observed when no grammar constraint is applied. Insertion errors can be attributed in part to the acknowledged weakness of the acoustic models for accurate temporal modeling of speech signals. Two forms of duratio...
Novel filler acoustic models for connected digit recognition
The context-dependent modeling technique is extended to include non-speech filler segments occurring between speech word units. In addition to the conventional context-dependent word or subword units, the proposed acoustic modeling provides an efficient way of accounting for the effects of the surrounding speech on the inter-word non-speech segments, especially for small vocabulary recognition task...
Duration modeling using cumulative duration probability and speaking rate compensation
A duration modeling scheme and a speaking rate compensation technique are presented for the HMM-based connected digit recognizer. The proposed duration modeling technique uses a cumulative duration probability. The cumulative duration probability can also be used to obtain the duration bounds for the bounded duration modeling. One of the advantages of the proposed technique is that the cumulative d...